Abstract

This paper introduces a new maximum likelihood (ML) solution for the code-aided (CA) timing recovery 
problem in square-QAM transmissions and derives, for the very first time, its CA Cramér-Rao lower bounds
(CRLBs) in closed-form expressions. By exploiting the full symmetry of square-QAM constellations and further
scrutinizing the Gray-coding mechanism, we express the likelihood function (LF) of the system explicitly in terms
of the code bits’ a priori log-likelihood ratios (LLRs). The timing recovery task is then embedded in the turbo
iteration loop wherein increasingly accurate estimates for such LLRs are computed from the output of the soft-
input soft-output (SISO) decoders and exploited on a per-turbo-iteration basis in order to refine the ML time delay
estimate. The latter is then used to better re-synchronize the system, through feedback to the matched filter (MF),
so as to obtain more reliable symbol-rate samples for the next turbo iteration. In order to properly benchmark the 
new CA ML estimator, we also derive for the very first time the closed-form expressions for the exact CRLBs 
of the underlying turbo synchronization problem. Computer simulations will show that the new closed-form 
CRLBs coincide exactly with their empirical counterparts evaluated previously using exhaustive Monte-Carlo 
simulations. They will also show unambiguously the remarkable performance improvements of CA estimation
against the traditional non-data-aided (NDA) scheme, thereby highlighting the potential performance gains in time
synchronization that can be achieved owing to the decoder assistance. Over a wide range of practical SNRs,
CA estimation becomes even equivalent to the completely data-aided (DA) scheme in which all the transmitted
symbols are perfectly known to the receiver. Moreover, the new CA ML estimator almost reaches the underlying 
CA CRLBs, even for small SNRs, thereby confirming its statistical efficiency in practice. It also enjoys significant 
improvements in computational complexity as compared to the most powerful existing ML solution, namely the 
combined sum-product and expectation-maximization (SP-EM) algorithm. 

I. Introduction 

In order to provide high quality of service while satisfying the ever-increasing demand for high
data rates, the use of powerful error-correcting codes in conjunction with high-spectral-efficiency
modulations is advocated. Indeed, turbo codes along with high-order quadrature amplitude modulations
(QAMs) are two key features of current and future wireless communication standards such as 4G long-term
evolution (LTE), LTE-advanced (LTE-A) and beyond (LTE-B) [1, 2]. As a crucial task in any
digital receiver [3], accurate time synchronization remains a challenging problem especially for turbo- 
coded systems since they are intended to operate at very low signal-to-noise ratios (SNRs). In fact, the 
widespread adoption of turbo codes is in part fueled by their ability to operate in the near-Shannon 
limit even under such adverse SNR conditions [4]. Yet, the salutary performance of these powerful 
error-correcting codes is prone to severe degradations if the system is not accurately synchronized in 
time, phase or frequency. The goal of time synchronization, in particular, consists in estimating and 
compensating for the unknown time delay introduced by the channel so as to provide the decision device 
with symbol-rate samples of lowest possible inter-symbol interference (ISI) corruption [3]. 

The problem of timing recovery for linearly-modulated transmissions has been heavily investigated over 
the last few decades. A plethora of time delay estimators (TDEs) have been introduced in the open 
literature and the vast majority of existing TDEs are intended to operate with complete unawareness of 
the code structure (see [5-14] and references therein). In other words, the TD estimate is acquired just
after oversampling the continuous-time signal and then provided to a discrete-time MF in order to output
the symbol-rate samples. The latter are then used by the turbo decoder, once and for all, to decode the data
bits. Therefore, the fact that a large portion of the data bits is to become almost perfectly known (i.e., 
correctly decoded) is systematically ignored by those estimators. The latter are referred to as non-code- 
aided (NCA) or simply NDA TDEs since no a priori knowledge about the transmitted symbols is used 
during the estimation process and, as such, they suffer from severe performance degradations under harsh 
SNR conditions. Being more accurate and usually less computationally expensive, DA methods require, 
however, the regular transmission of a completely known (i.e., pilot) sequence thereby limiting the whole 
throughput of the system. 

It sounds reasonable then to conceive a third alternative as a middle ground between these two extreme 
NDA and DA estimation schemes. Indeed, rather than relying on perfectly known or completely unknown 
symbols, CA estimation takes advantage of the soft information delivered by the decoder at each turbo 
iteration. In plain English, the decoder assistance is called upon in an attempt to enhance the timing 
recovery capabilities yet with no impact on the spectral efficiency of the system. In fact, from one turbo 
iteration to another, more refined soft information about the conveyed bits is delivered by the two
soft-input soft-output (SISO) decoders. These are i) the a posteriori LLRs of the code bits and ii) their
extrinsic information. According to the turbo principle, the latter are iteratively exchanged between the
two SISO decoders until achieving a steady state whose a posteriori LLRs are used as decision metrics for
data detection. In a nutshell, CA estimation consists simply in leveraging those soft outputs, by embedding 
the timing recovery task into the decoding process, in an attempt to enhance the estimation performance 
and vice versa. In the context of timing, phase, and frequency recovery, such a CA estimation scheme is
usually referred to as turbo synchronization [26]. A number of CA timing recovery algorithms have been
proposed over the last decade [15-27] and, to the best of the authors’ knowledge, only two approaches
are derived from ML theory. The first one [19] is based on the well-known expectation maximization
(EM) algorithm whereas the second [25] is a combined sum-product (SP) and EM algorithm approach
(i.e., an improvement of [19]). The SP-EM-based ML estimator offers indeed significant performance
improvements over the original EM-based estimator but at the cost of increased computational complexity.
In the SP-EM-based ML approach, an EM iteration loop is required in each turbo iteration wherein the
algorithm switches between the so-called expectation step (E-STEP) and maximization step (M-STEP).
Roughly speaking, in each turbo iteration, the algorithm performs the following main four steps for each 
EM iteration: 

• Obtain new symbol-rate samples (via MF) using the TD estimate of the previous EM iteration;

• Update the symbols’ a posteriori probabilities (APoPs) using those new symbol-rate samples; 

• Marginalize empirically the conditional (on the transmitted symbols) likelihood function with respect 
to those APoPs (E-STEP) ; 

• Maximize the marginalized LF with respect to the working TD variable (M-STEP).

At the convergence of the EM algorithm, the obtained TD estimate is used to acquire new ISI-reduced
(symbol-rate) samples which will serve as input for the next turbo iteration where all the aforementioned 
EM-related steps are repeated. 

In this paper, we reconsider the problem of CA time synchronization from both the “performance bounds”
and “algorithmic” points of view. By exploiting the full symmetry of square-QAM constellations and
further scrutinizing the Gray coding mechanism, we are able to derive a closed-form and very simple
expression for the system’s LF. Specifically, marginalization of the conditional LF with respect to the transmitted
symbols is carried out analytically and the a priori LLRs of the elementary code bits are explicitly
incorporated in the LF expression. We thereby propose a more systematic framework for their direct
integration in the CA estimation process, eliminating completely the need for the EM iteration
loop under each turbo iteration. In other words, the new LF needs to be maximized only once per turbo
iteration (contrary to SP-EM) after being updated by the associated a priori LLRs which are computed
from the output of the SISO decoders. Consequently, the proposed CA timing recovery algorithm offers
significant improvements in computational complexity as compared to the existing SP-EM. As a matter
of fact, the new algorithm is 35 and 70 times less computationally complex than SP-EM for 64 and 256
QAMs, respectively. It also enjoys an advantage in terms of estimation accuracy for low SNR levels and
higher-order modulations.

From the “performance bounds” point of view, we also tackle the analytical derivation of the stochastic
CRLBs for the underlying CA estimation problem. Actually, unlike many other loose bounds, the
stochastic* CRLB is a fundamental lower bound that reflects the actual achievable performance in practice
[36]. Yet, even under uncoded transmissions, the complex structure of the LF makes it extremely hard,
if not impossible, to derive analytical expressions for this practical bound, especially with high-order
modulations. Therefore, in the specific context of timing recovery, the stochastic CRLBs were previously
evaluated using exhaustive Monte-Carlo simulations (i.e., empirically) in [30] and [29] for both NCA
and CA estimations, respectively. Only recently were their analytical expressions established [31],
and only in the NCA (i.e., NDA) case.

In this paper, we succeed in factorizing the LF of the coded system as the product of two analogous
terms involving two random variables that are almost identically distributed, i.e., their probability density
functions (pdfs) have the same expression but are parametrized differently. We then capitalize on this
interesting property to derive, for the very first time, the closed-form expressions for the TD CA CRLBs
from arbitrary turbo-coded square-QAM-modulated transmissions. The new closed-form expressions
corroborate the previous attempts reported in [29] to evaluate the TD CA CRLBs empirically and offer a
way to their immediate evaluation in practice. Moreover, as will be shown later, the previously published
closed-form NDA CRLBs [31] boil down to a very special case of the new closed-form CA CRLBs by
simply setting all the code bits’ a priori LLRs to zero.

*In linearly-modulated transmissions, the stochastic model refers to estimation under the assumption of unknown and random transmitted
symbols. This is to be opposed to the deterministic model wherein the symbols are assumed to be unknown but not random [5].

The rest of this paper is structured as follows. In Section II, we present the system model. In Section III,
we derive the expression of the log-likelihood function (LLF) and express it explicitly as a function of the
coded bits’ a priori LLRs. In Section IV, we establish the new closed-form expressions for the TD CA
CRLBs. In Section V, we introduce the new CA ML time delay estimator. In Section VI, we discuss
the simulation results of the proposed CA ML estimator and closed-form CRLBs. Finally, we draw out
some concluding remarks in Section VII.

We also mention beforehand the common notations that will be used in this paper. Vectors and
matrices are represented in lower- and upper-case bold fonts, respectively. $\mathbf{I}_N$ and $\mathbf{0}_N$ denote, respectively,
the $N \times N$ identity matrix and the $N$-dimensional all-zero vector. The shorthand notation $\mathbf{x} \sim \mathcal{N}(\mathbf{m}, \mathbf{R})$
means that the vector $\mathbf{x}$ follows a normal (i.e., Gaussian) distribution with mean $\mathbf{m}$ and auto-covariance
matrix $\mathbf{R}$. Moreover, $\{.\}^T$ and $\{.\}^H$ denote the transpose and the Hermitian (transpose conjugate)
operators, respectively. The operators $\Re\{.\}$ and $\Im\{.\}$ return, respectively, the real and imaginary parts of
any complex number. The operators $\{.\}^*$ and $|.|$ return its conjugate and its amplitude, respectively, and $j$
is the pure complex number that verifies $j^2 = -1$. The Kronecker and Dirac delta functions are denoted,
respectively, as $\delta_{m,n}$ and $\delta(t)$. We will also denote the probability mass function (PMF) for discrete
random variables (RVs) by $P[.]$ and the pdf for continuous RVs by $p[.]$. The statistical expectation is
denoted as $E\{.\}$ and the notation $\triangleq$ is used for definitions.

II. System Model

Consider a turbo-coded system where a binary sequence of information bits is fed into a turbo encoder 
consisting of two identical recursive and systematic convolutional codes (RSCs) which are concatenated 
in parallel via an inner interleaver $\Pi_1$. The resulting code bits are fed into a puncturer which selects an
appropriate combination of the parity bits, from both encoders, in order to achieve a desired code rate
$R$. The entire code bit sequence is then scrambled with an outer interleaver, $\Pi_2$, and divided into $K$
subgroups of $2p$ bits each for some integer $p \geq 1$. The $k$th subgroup of code bits, $b_1^k \cdots b_l^k \cdots b_{2p}^k$, is
conveyed by a symbol $a(k)$ that is selected from a fixed alphabet, $\mathcal{C}_p = \{c_0, c_1, \cdots, c_{M-1}\}$, of an $M$-ary
(with $M = 2^{2p}$) QAM constellation (i.e., square-QAM). In fact, each point, $c_m \in \mathcal{C}_p$, is mapped onto a
unique sequence of $\log_2(M) = 2p$ bits denoted here as $\bar{b}_1^m \cdots \bar{b}_l^m \cdots \bar{b}_{2p}^m$, according to the Gray coding
mechanism, and the point $c_m$ is selected to convey the $k$th code bits subgroup [i.e., $a(k) = c_m$] if and
only if $b_l^k = \bar{b}_l^m$ for $l = 1, 2, \cdots, 2p$. We also define the a priori LLR of the $l$th code bit, $b_l^k$, conveyed
by $a(k)$ as follows:

$$L_l(k) \triangleq \ln\left(\frac{P[b_l^k = 1]}{P[b_l^k = 0]}\right). \tag{1}$$


Using (1) and the fact that $P[b_l^k = 0] + P[b_l^k = 1] = 1$, it can be easily shown that:

$$P[b_l^k = 1] = \frac{e^{L_l(k)}}{1 + e^{L_l(k)}} \quad \text{and} \quad P[b_l^k = 0] = \frac{1}{1 + e^{L_l(k)}}. \tag{2}$$


For mathematical convenience, the two identities in (2) can be merged together to yield the following
common generic expression:

$$P\big[b_l^k = \bar{b}_l^m\big] = \frac{1}{2\cosh\left(L_l(k)/2\right)}\, e^{(2\bar{b}_l^m - 1)\frac{L_l(k)}{2}}, \tag{3}$$

in which $\bar{b}_l^m$ is either 0 or 1 depending on which of the symbols $c_m$ is transmitted, at time instant $k$,
and of course on the Gray mapping associated to the constellation. The obtained information-bearing
symbols, $\{a(k)\}_{k=1}^{K}$, are then pulse-shaped and the resulting continuous-time signal:

$$x(t) = \sum_{k=1}^{K} a(k)\, h(t - kT), \tag{4}$$

is transmitted over the communication channel with $T$ being the symbol duration and $h(t)$ a unit-energy
square-root shaping pulse. Being completely unknown to the receiver a priori, the transmitted symbols
$\{a(k)\}_k$ are drawn from a given $M$-ary Gray-coded (GC) square-QAM constellation whose alphabet is
denoted as $\mathcal{C}_p = \{c_0, c_1, \cdots, c_{M-1}\}$. Here, by square QAM we mean $M = 2^{2p}$ for some integer $p \geq 1$
(i.e., QPSK, 16-QAM, 64-QAM, etc.). The Nyquist pulse $g(t)$ obtained from $h(t)$ is defined as:


$$g(t) = \int_{-\infty}^{+\infty} h(x)\, h(t + x)\, dx, \tag{5}$$

and satisfies the first Nyquist criterion [3]:

$$g(nT) = 0, \quad \text{for any integer } n \neq 0. \tag{6}$$

At the receiver side, assuming perfect frequency and phase synchronizations, the (delayed) continuous-time
received signal before matched filtering is expressed as:

$$y(t) = \sqrt{E_s}\, x(t - \tau) + w(t), \tag{7}$$

where $E_s$ is the transmit signal energy and $\tau$ is the unknown time delay parameter to be estimated.
Moreover, $w(t)$ is a proper complex additive white Gaussian noise (AWGN) with independent real and
imaginary parts, each of variance $\sigma^2$ (i.e., with overall noise power $N_0 = 2\sigma^2$). The SNR of the channel
is also denoted as:

$$\rho \triangleq \frac{E_s}{N_0} = \frac{E_s}{2\sigma^2}. \tag{8}$$


An integral step in the derivation of stochastic ML estimators and CRLBs consists in finding the LLF
of the system. This requires marginalizing the conditional (on the unknown symbols) LF over the
constellation alphabet. In completely NDA estimation (or before data detection), no a priori information
is available about the transmitted symbols. Therefore, the latter are usually assumed to be equally likely,
i.e., with equal a priori probabilities (APPs). That is to say, $\forall\, c_m \in \mathcal{C}_p$:

$$P[a(k) = c_m] = \frac{1}{M}, \quad \text{for } k = 1, 2, \cdots, K. \tag{9}$$


In CA estimation, however, the actual APPs of the transmitted symbols must be used in order to enhance 
the estimation performance as done in the next section. By doing so, we will ultimately express the 
LLF explicitly as a function of the a priori LLRs of the individual coded bits. As will be explained later
in Section IV-E, accurate estimates for the underlying LLRs can be obtained in practice from the soft 
outputs of the two SISO decoders at the convergence of the BCJR algorithm [32]. 
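
As a small numerical illustration of how the bit-level LLRs translate into symbol APPs, the following sketch evaluates the generic expression (3) for each bit and multiplies the results over the $2p$ bits of a label. It assumes independent code bits within a symbol; the function names and the LLR values are hypothetical and serve only as an example.

```python
import math
from itertools import product


def bit_probability(llr, bit):
    """P[b = bit] from the a priori LLR L = ln(P[b=1]/P[b=0]), as in (3)."""
    return math.exp((2 * bit - 1) * llr / 2.0) / (2.0 * math.cosh(llr / 2.0))


def symbol_apps(llrs):
    """A priori probability of every 2p-bit label, assuming independent bits.

    llrs: the 2p LLRs L_1(k), ..., L_2p(k) attached to one symbol.
    Returns a dict mapping each bit tuple to its probability.
    """
    apps = {}
    for bits in product((0, 1), repeat=len(llrs)):
        prob = 1.0
        for llr, bit in zip(llrs, bits):
            prob *= bit_probability(llr, bit)
        apps[bits] = prob
    return apps


if __name__ == "__main__":
    llrs = [0.8, -1.5, 0.0, 2.3]   # hypothetical LLRs for one 16-QAM symbol
    apps = symbol_apps(llrs)
    print(sum(apps.values()))      # the APPs should sum to one
    # With all LLRs set to zero, every label is equally likely, as in (9).
    print(symbol_apps([0.0] * 4)[(0, 0, 0, 0)])
```

Setting all LLRs to zero recovers the equally-likely NDA case of (9), which is the sense in which NDA estimation is a special case of the CA formulation.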

III. Derivation of the LLF

As widely known, the set of finite-energy signals usually denoted as:

$$\mathcal{L}_2 = \left\{ s(t) \text{ such that } \int_{\mathbb{R}} |s(t)|^2\, dt < +\infty \right\},$$

forms an infinite-dimensional Hilbert subspace [35] that can be endowed with an orthonormal basis $\{\varphi_n(t)\}_n$
and an inner product as follows:

$$\langle s_1(t), s_2(t) \rangle = \int_{\mathbb{R}} s_1(t)\, s_2(t)^*\, dt, \quad \forall\, s_1(t), s_2(t) \in \mathcal{L}_2. \tag{10}$$


Therefore, an exact discrete representation for any continuous-time signal $s(t) \in \mathcal{L}_2$ requires an infinite-dimensional
vector, $\mathbf{s}$, that contains its expansion coefficients, $\{s_n = \langle s(t), \varphi_n(t) \rangle\}_n$, in the basis
$\{\varphi_n(t)\}_n$. To sidestep this problem, we first consider the $N$-dimensional truncated representation vectors:

$$\mathbf{y}_N = [y_1, y_2, \ldots, y_N]^T, \tag{11}$$

$$\mathbf{w}_N = [w_1, w_2, \ldots, w_N]^T, \tag{12}$$

$$\mathbf{x}_N(\tau) = [x_1(\tau), x_2(\tau), \ldots, x_N(\tau)]^T, \tag{13}$$


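that contain the orthogonal projection coefficients of $y(t)$, $w(t)$, and $x(t - \tau)$, respectively, over the first
$N$ basis functions $\{\varphi_n(t)\}_{n=1}^{N}$ (for any $N \geq 1$), i.e.:

$$y_n = \langle y(t), \varphi_n(t) \rangle = \int_{\mathbb{R}} y(t)\, \varphi_n(t)^*\, dt, \tag{14}$$

$$w_n = \langle w(t), \varphi_n(t) \rangle = \int_{\mathbb{R}} w(t)\, \varphi_n(t)^*\, dt, \tag{15}$$

$$x_n(\tau) = \langle x(t - \tau), \varphi_n(t) \rangle = \int_{\mathbb{R}} x(t - \tau)\, \varphi_n(t)^*\, dt. \tag{16}$$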


Using (7) and (14) to (16), it follows that:

$$\mathbf{y}_N = \sqrt{E_s}\, \mathbf{x}_N(\tau) + \mathbf{w}_N. \tag{17}$$


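Due to the orthogonality of the basis functions, it can be shown that the noise projection coefficients,
explicitly given by (15), are uncorrelated, i.e., $E\{w_n w_m^*\} = 2\sigma^2 \delta_{n,m}$. Hence, they are independent
since they are also Gaussian-distributed², leading to $\mathbf{w}_N \sim \mathcal{N}(\mathbf{0}_N, 2\sigma^2 \mathbf{I}_N)$. Therefore, the pdf of the
vector $\mathbf{y}_N$ in (17) conditioned on the sequence of transmitted symbols, $\mathbf{a} = [a(1), a(2), \ldots, a(K)]^T$, and
parametrized by $\tau$ is given by:

$$p(\mathbf{y}_N | \mathbf{a}; \tau) = \prod_{n=1}^{N} \frac{1}{2\pi\sigma^2} \exp\left\{ -\frac{1}{2\sigma^2} \left| y_n - \sqrt{E_s}\, x_n(\tau) \right|^2 \right\}. \tag{18}$$

Note here that, although we do not show it explicitly, the transmitted symbols are indeed involved in
(18) via the coefficients $\{x_n(\tau)\}_n$. After dropping the constant terms that do not depend explicitly on $\tau$
in (18), we obtain the simplified truncated LF:

$$\Lambda_N(\mathbf{y}_N | \mathbf{a}; \tau) = \exp\left\{ \frac{\sqrt{E_s}}{\sigma^2} \sum_{n=1}^{N} \Re\{ y_n\, x_n(\tau)^* \} - \frac{E_s}{2\sigma^2} \sum_{n=1}^{N} |x_n(\tau)|^2 \right\}. \tag{19}$$

The conditional LF which incorporates all the information contained in the non-truncated vector $\mathbf{y}$ [or
equivalently the received continuous-time signal $y(t)$], is obtained by making $N$ tend to infinity in (19).
By doing so and using the Plancherel equality, we obtain the following conditional LF:

$$\Lambda(\mathbf{y} | \mathbf{a}; \tau) = \exp\left\{ \frac{\sqrt{E_s}}{\sigma^2} \int_{\mathbb{R}} \Re\{ y(t)\, x(t - \tau)^* \}\, dt - \frac{E_s}{2\sigma^2} \int_{\mathbb{R}} |x(t - \tau)|^2\, dt \right\}. \tag{20}$$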

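Now, replacing the transmitted signal $x(t)$ by its expression given in (4), and exploiting the fact that the
Nyquist pulse, $g(t)$, in (5) verifies the first Nyquist criterion (6), it can be shown that:

$$\Lambda(\mathbf{y} | \mathbf{a}; \tau) = \prod_{k=1}^{K} \Omega_\tau\big(a(k), y(t)\big), \tag{21}$$

where

$$\Omega_\tau\big(a(k), y(t)\big) = \exp\left\{ \frac{\sqrt{E_s}}{\sigma^2} \int_{\mathbb{R}} \Re\{ y(t)\, a(k)^* \}\, h(t - kT - \tau)\, dt - \frac{E_s}{2\sigma^2} |a(k)|^2 \right\}. \tag{22}$$

The unconditional LF, $\Lambda(\mathbf{y}; \tau)$, is obtained by averaging (21) over all possible transmitted symbol
sequences of size $K$, i.e., $\Lambda(\mathbf{y}; \tau) = E_{\mathbf{a}}\{\Lambda(\mathbf{y} | \mathbf{a}; \tau)\}$, leading to:

$$\Lambda(\mathbf{y}; \tau) = \sum_{\mathbf{c}_i \in \mathcal{C}_p^K} P[\mathbf{a} = \mathbf{c}_i]\, \Lambda(\mathbf{y} | \mathbf{a} = \mathbf{c}_i; \tau). \tag{23}$$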




²This is because they are obtained by some linear transformations (i.e., the orthogonal projection) of the original continuous-time white
Gaussian random process $w(t)$.

Under coded digital transmissions, a simplifying assumption is usually used in estimation practices,
whether CA or NCA, in order to allow for tractable mathematical derivations of CRLBs and ML estimators 
of any parameter. This assumption postulates that the transmitted symbols are independent (cf. [15-30] 
and references therein) in spite of the statistical dependence between the coded bits that is introduced by 
channel coding. In fact, before even initiating the decoding process itself, the system needs to be fully 
synchronized by estimating the time delay, as well as, the phase and frequency offsets. Moreover, the 
decoder itself needs some estimates for other key channel parameters, e.g., the channel coefficient, noise 
variance, SNR, etc. All those estimates are obtained by applying traditional NDA estimators directly 
on the symbol-rate samples that are delivered by the matched filter before starting data decoding. As a 
matter of fact, in digital transmissions, all state-of-the-art NDA estimators (for any parameter, whether 
maximum likelihood or moment-based) are indeed based on the assumption of independent symbols 
although the latter are actually dependent due to channel coding. 

We emphasize, however, that relying on this assumption does not preclude exploiting the
dependence of the coded bits during the decoding process itself. Indeed, such dependence is exploited
by the SISO decoders in order to output the estimates for the coded bits’ a posteriori LLRs. The latter
are then used to decode the bits and also to compute their a priori LLRs (as explained later in Section
IV-E) which are in turn used to evaluate the CA CRLBs and to find the CA TD ML estimate. Yet, even
by assuming independent symbols (both in this paper and all existing works), it turns out that not much
information is lost from the estimation point of view. In fact, the resulting CA estimation schemes achieve
the performance of the ideal data-aided one (where all the symbols are perfectly known) over a wide range of practical SNRs
where the completely NDA schemes do not (cf. Figs. 4 and 5 in this paper and the reported simulation
results in other researchers’ works). Using the assumption of independent symbols it follows that:

$$P[\mathbf{a} = \mathbf{c}_i] = \prod_{k=1}^{K} P[a(k) = c_i(k)]. \tag{24}$$
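
Plugging (21) and (24) in (23), it can be shown that:

$$\Lambda(\mathbf{y}; \tau) = \sum_{\mathbf{c}_i \in \mathcal{C}_p^K} \prod_{k=1}^{K} P[a(k) = c_i(k)]\, \Omega_\tau\big(c_i(k), y(t)\big) = \prod_{k=1}^{K} \sum_{c_m \in \mathcal{C}_p} P[a(k) = c_m]\, \Omega_\tau\big(c_m, y(t)\big). \tag{25}$$

Therefore, the unconditional log-likelihood function (LLF) defined as $\mathcal{L}(\mathbf{y}; \tau) \triangleq \ln\big(\Lambda(\mathbf{y}; \tau)\big)$, is given
by:

$$\mathcal{L}(\mathbf{y}; \tau) = \sum_{k=1}^{K} \ln\Big(\Omega_k\big(\tau, y(t)\big)\Big), \tag{26}$$

in which $\Omega_k(\tau, y(t))$ is simply the average of $\Omega_\tau(a(k), y(t))$ over the constellation alphabet, i.e.:

$$\Omega_k\big(\tau, y(t)\big) = \sum_{c_m \in \mathcal{C}_p} P[a(k) = c_m]\, \Omega_\tau\big(c_m, y(t)\big). \tag{27}$$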

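For ease of notation, we will hereafter no longer show the dependence of $\Omega_k(\tau, y(t))$ on the received
signal, $y(t)$, and denote it simply as $\Omega_k(\tau)$. Next, we will further manipulate this term and ultimately
factorize it into two analogous terms which involve two independent and almost identically distributed
RVs. In fact, by further denoting the top-right quadrant of the constellation as $\tilde{\mathcal{C}}_p$, it follows that $\mathcal{C}_p =
\tilde{\mathcal{C}}_p \cup (-\tilde{\mathcal{C}}_p) \cup \tilde{\mathcal{C}}_p^* \cup (-\tilde{\mathcal{C}}_p^*)$. Thus, the sum over $c_m \in \mathcal{C}_p$ in (27) can be equivalently replaced by a sum
over each $\tilde{c}_m \in \tilde{\mathcal{C}}_p$ and its three symmetrical points in the other quadrants. By doing so and noticing that
$|\tilde{c}_m| = |-\tilde{c}_m| = |\tilde{c}_m^*| = |-\tilde{c}_m^*|$, we obtain from (22) and (27):

$$\begin{aligned}
\Omega_k(\tau) = \sum_{\tilde{c}_m \in \tilde{\mathcal{C}}_p} e^{-\frac{E_s}{2\sigma^2}|\tilde{c}_m|^2} \Bigg( & P[a(k) = \tilde{c}_m] \exp\left\{ \frac{\sqrt{E_s}}{\sigma^2} \int_{\mathbb{R}} \Re\{\tilde{c}_m^*\, y(t)\}\, h(t - kT - \tau)\, dt \right\} \\
+\; & P[a(k) = -\tilde{c}_m] \exp\left\{ \frac{\sqrt{E_s}}{\sigma^2} \int_{\mathbb{R}} \Re\{-\tilde{c}_m^*\, y(t)\}\, h(t - kT - \tau)\, dt \right\} \\
+\; & P[a(k) = \tilde{c}_m^*] \exp\left\{ \frac{\sqrt{E_s}}{\sigma^2} \int_{\mathbb{R}} \Re\{\tilde{c}_m\, y(t)\}\, h(t - kT - \tau)\, dt \right\} \\
+\; & P[a(k) = -\tilde{c}_m^*] \exp\left\{ \frac{\sqrt{E_s}}{\sigma^2} \int_{\mathbb{R}} \Re\{-\tilde{c}_m\, y(t)\}\, h(t - kT - \tau)\, dt \right\} \Bigg).
\end{aligned} \tag{28}$$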
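
Using a simple recursive scheme that allows the construction of arbitrary square-QAM constellations, it
has been recently shown in [33] that the APPs for each symbol $a(k)$ are expressed as follows ($\forall\, \tilde{c}_m \in \tilde{\mathcal{C}}_p$):

$$P[a(k) = \tilde{c}_m] = \beta_k\, \mu_{k,p}(\tilde{c}_m)\, e^{\frac{L_{2p-1}(k)}{2}}\, e^{\frac{L_{2p}(k)}{2}}, \tag{29}$$

$$P[a(k) = \tilde{c}_m^*] = \beta_k\, \mu_{k,p}(\tilde{c}_m)\, e^{-\frac{L_{2p-1}(k)}{2}}\, e^{\frac{L_{2p}(k)}{2}}, \tag{30}$$

$$P[a(k) = -\tilde{c}_m] = \beta_k\, \mu_{k,p}(\tilde{c}_m)\, e^{-\frac{L_{2p-1}(k)}{2}}\, e^{-\frac{L_{2p}(k)}{2}}, \tag{31}$$

$$P[a(k) = -\tilde{c}_m^*] = \beta_k\, \mu_{k,p}(\tilde{c}_m)\, e^{\frac{L_{2p-1}(k)}{2}}\, e^{-\frac{L_{2p}(k)}{2}}, \tag{32}$$

in which $\mu_{k,p}(\tilde{c}_m)$ and $\beta_k$ are given by:

$$\mu_{k,p}(\tilde{c}_m) \triangleq \prod_{l=1}^{2p-2} e^{(2\bar{b}_l^m - 1)\frac{L_l(k)}{2}}, \quad \tilde{c}_m \in \tilde{\mathcal{C}}_p, \tag{33}$$

$$\beta_k \triangleq \prod_{l=1}^{2p} \frac{1}{2\cosh\left(L_l(k)/2\right)}. \tag{34}$$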

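Plugging (29)-(32) back into (28) and using the trivial identity $e^{x} + e^{-x} = 2\cosh(x)$, it can be shown
that:

$$\begin{aligned}
\Omega_k(\tau) = 2\beta_k \sum_{\tilde{c}_m \in \tilde{\mathcal{C}}_p} \mu_{k,p}(\tilde{c}_m)\, e^{-\frac{E_s}{2\sigma^2}|\tilde{c}_m|^2} \times \Bigg[ & \cosh\left( \frac{\sqrt{E_s}}{\sigma^2} \int_{\mathbb{R}} \Re\{\tilde{c}_m^*\, y(t)\}\, h(t - kT - \tau)\, dt + \frac{L_{2p}(k)}{2} + \frac{L_{2p-1}(k)}{2} \right) \\
+\; & \cosh\left( \frac{\sqrt{E_s}}{\sigma^2} \int_{\mathbb{R}} \Re\{\tilde{c}_m\, y(t)\}\, h(t - kT - \tau)\, dt + \frac{L_{2p}(k)}{2} - \frac{L_{2p-1}(k)}{2} \right) \Bigg].
\end{aligned} \tag{35}$$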


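Furthermore, by using the relationship $\cosh(x) + \cosh(y) = 2\cosh\left(\frac{x+y}{2}\right)\cosh\left(\frac{x-y}{2}\right)$ along with the two
identities $\tilde{c}_m + \tilde{c}_m^* = 2\Re\{\tilde{c}_m\}$ and $\tilde{c}_m - \tilde{c}_m^* = 2j\Im\{\tilde{c}_m\}$, it can be shown that (35) can be rewritten as
follows [recall from (8) that $\rho = E_s/(2\sigma^2)$]:

$$\Omega_k(\tau) = 4\beta_k \sum_{\tilde{c}_m \in \tilde{\mathcal{C}}_p} \mu_{k,p}(\tilde{c}_m)\, e^{-\rho|\tilde{c}_m|^2} \cosh\left( \frac{\sqrt{E_s}\,\Re\{\tilde{c}_m\}}{\sigma^2}\, u_k(\tau) + \frac{L_{2p}(k)}{2} \right) \times \cosh\left( \frac{\sqrt{E_s}\,\Im\{\tilde{c}_m\}}{\sigma^2}\, v_k(\tau) + \frac{L_{2p-1}(k)}{2} \right), \tag{36}$$

in which $u_k(\tau)$ and $v_k(\tau)$ are the matched-filtered in-phase and quadrature components of the received
signal given by:

$$u_k(\tau) = \int_{-\infty}^{+\infty} \Re\{y(t)\}\, h(t - kT - \tau)\, dt, \tag{37}$$

$$v_k(\tau) = \int_{-\infty}^{+\infty} \Im\{y(t)\}\, h(t - kT - \tau)\, dt. \tag{38}$$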
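
Since in the Cartesian coordinate system of the constellation each $\tilde{c}_m \in \tilde{\mathcal{C}}_p$ can be written³ as $\tilde{c}_m =
[2i - 1]d_p + j[2n - 1]d_p$ for some $1 \leq i, n \leq 2^{p-1}$, then the single sum over $\tilde{c}_m$ in (36) can be equivalently
replaced by a double sum over the two counters $i$ and $n$ as follows:

$$\begin{aligned}
\Omega_k(\tau) = 4\beta_k \sum_{i=1}^{2^{p-1}} \sum_{n=1}^{2^{p-1}} & \mu_{k,p}\big([2i-1]d_p + j[2n-1]d_p\big)\, e^{-\rho\left([2i-1]^2 + [2n-1]^2\right)d_p^2} \\
& \times \cosh\left( \frac{\sqrt{E_s}\,[2i-1]d_p}{\sigma^2}\, u_k(\tau) + \frac{L_{2p}(k)}{2} \right) \times \cosh\left( \frac{\sqrt{E_s}\,[2n-1]d_p}{\sigma^2}\, v_k(\tau) + \frac{L_{2p-1}(k)}{2} \right).
\end{aligned} \tag{39}$$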


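We also recall the following decomposition recently shown in [33] for each $\tilde{c}_m = [2i - 1]d_p + j[2n - 1]d_p \in
\tilde{\mathcal{C}}_p$:

$$\mu_{k,p}\big([2i-1]d_p + j[2n-1]d_p\big) = \theta_{k,2p}(i)\, \theta_{k,2p-1}(n), \tag{40}$$

where

$$\theta_{k,2p}(i) \triangleq \prod_{l=1}^{p-1} e^{(2\bar{b}_{2l}^{i} - 1)\frac{L_{2l}(k)}{2}}, \tag{41}$$

$$\theta_{k,2p-1}(n) \triangleq \prod_{l=1}^{p-1} e^{(2\bar{b}_{2l-1}^{n} - 1)\frac{L_{2l-1}(k)}{2}}. \tag{42}$$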

After using (40) in (39) and splitting the two sums, we obtain the following very useful factorization
for $\Omega_k(\tau)$:

$$\Omega_k(\tau) = 4\beta_k\, F_{k,2p}\big(u_k(\tau)\big)\, F_{k,2p-1}\big(v_k(\tau)\big), \tag{43}$$

where

$$F_{k,q}(x) = \sum_{i=1}^{2^{p-1}} \theta_{k,q}(i)\, e^{-\rho[2i-1]^2 d_p^2} \cosh\left( \frac{\sqrt{E_s}\,[2i-1]d_p}{\sigma^2}\, x + \frac{L_q(k)}{2} \right), \tag{44}$$

in which $q$ is a generic counter that is used from now on to refer to $2p$ or $2p - 1$ depending on the
context. Finally, by using (43) back in (26) and dropping the constant term $4\beta_k$ that does not depend on
$\tau$, the useful LLF develops into:

$$\mathcal{L}(\mathbf{y}; \tau) = \sum_{k=1}^{K} \ln\Big(F_{k,2p}\big(u_k(\tau)\big)\Big) + \sum_{k=1}^{K} \ln\Big(F_{k,2p-1}\big(v_k(\tau)\big)\Big). \tag{45}$$

³Note here that $d_p$ is half the minimum inter-symbol distance whose expression is given in [33, eq. (30)] explicitly as a function of $p$ for
normalized-energy constellations.

We succeeded here in decomposing the LLF into two analogous terms [the two sums in (45)], each
involving either the RVs $u_k(\tau)$ or $v_k(\tau)$ that will be shortly shown to have almost the same distributions. This
is actually the cornerstone result upon which we will establish the analytical expressions for the CA TD
CRLBs in the next section.
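
To make the structure of (44) and (45) concrete, the sketch below evaluates $F_{k,q}(x)$ and the useful LLF for given matched-filter outputs. It is only an illustration under stated assumptions: the weights $\theta_{k,q}(i)$ are passed in by the caller rather than derived from an actual Gray mapping, and the function and argument names are hypothetical.

```python
import math


def f_kq(x, thetas, llr_q, d_p, es, sigma2):
    """F_{k,q}(x) of (44).

    thetas: the 2^(p-1) weights theta_{k,q}(i), i = 1..2^(p-1), supplied by the caller.
    llr_q:  L_q(k), the LLR appearing inside the cosh argument.
    """
    rho = es / (2.0 * sigma2)
    total = 0.0
    for i, theta in enumerate(thetas, start=1):
        level = (2 * i - 1) * d_p
        total += (theta * math.exp(-rho * level ** 2)
                  * math.cosh(math.sqrt(es) * level * x / sigma2 + llr_q / 2.0))
    return total


def llf(u, v, thetas_i, thetas_q, llrs_i, llrs_q, d_p, es, sigma2):
    """Useful LLF (45): sum over k of ln F_{k,2p}(u_k) + ln F_{k,2p-1}(v_k).

    u, v:               in-phase and quadrature samples u_k(tau), v_k(tau) for one trial tau.
    thetas_i, thetas_q: per-symbol lists of theta_{k,2p}(i) and theta_{k,2p-1}(n).
    llrs_i, llrs_q:     per-symbol L_{2p}(k) and L_{2p-1}(k).
    """
    total = 0.0
    for k in range(len(u)):
        total += math.log(f_kq(u[k], thetas_i[k], llrs_i[k], d_p, es, sigma2))
        total += math.log(f_kq(v[k], thetas_q[k], llrs_q[k], d_p, es, sigma2))
    return total
```

In a CA estimator built on this expression, one would presumably recompute `u` and `v` for each candidate delay and keep the delay that maximizes `llf`; for QPSK ($p = 1$) each list of weights reduces to the single value 1.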

IV. Derivation of the CA CRLBs 

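As an overall benchmark, the CRLB lower bounds the variance of any unbiased estimator, $\hat{\tau}$, of the
time delay parameter, i.e., $E\{(\hat{\tau} - \tau)^2\} \geq \mathrm{CRLB}(\tau)$. It is explicitly given by [36]:

$$\mathrm{CRLB}(\tau) = \frac{1}{I(\tau)}, \tag{46}$$

where $I(\tau)$ is the so-called Fisher information for the received data which is given by:

$$I(\tau) = -E\left\{ \frac{\partial^2 \mathcal{L}(\mathbf{y}; \tau)}{\partial \tau^2} \right\}. \tag{47}$$

Using (45) in (47) and owing to the linearity of the partial derivative and expectation operators, it
immediately follows that:

$$I(\tau) = \sum_{k=1}^{K} \Big( \gamma_{k,2p}(\tau) + \gamma_{k,2p-1}(\tau) \Big), \tag{48}$$

where

$$\gamma_{k,2p}(\tau) = -E\left\{ \partial^2 \ln\Big(F_{k,2p}\big(u_k(\tau)\big)\Big) / \partial \tau^2 \right\}, \tag{49}$$

$$\gamma_{k,2p-1}(\tau) = -E\left\{ \partial^2 \ln\Big(F_{k,2p-1}\big(v_k(\tau)\big)\Big) / \partial \tau^2 \right\}. \tag{50}$$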

Before delving too much into details, we state the following result that is extremely useful to the derivation
of the analytical expressions of the two terms $\gamma_{k,2p}(\tau)$ and $\gamma_{k,2p-1}(\tau)$.
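
Lemma 1: $u_k(\tau)$ and $v_k(\tau)$ are two independent RVs whose distributions are given by:

$$p[u_k(\tau)] = \frac{2\beta_{k,2p}}{\sqrt{2\pi\sigma^2}}\, F_{k,2p}\big(u_k(\tau)\big)\, e^{-\frac{u_k(\tau)^2}{2\sigma^2}}, \tag{51}$$

$$p[v_k(\tau)] = \frac{2\beta_{k,2p-1}}{\sqrt{2\pi\sigma^2}}\, F_{k,2p-1}\big(v_k(\tau)\big)\, e^{-\frac{v_k(\tau)^2}{2\sigma^2}}, \tag{52}$$

with

$$\beta_{k,2p} \triangleq \prod_{l=1}^{p} \frac{1}{2\cosh\left(L_{2l}(k)/2\right)}, \tag{53}$$

$$\beta_{k,2p-1} \triangleq \prod_{l=1}^{p} \frac{1}{2\cosh\left(L_{2l-1}(k)/2\right)}. \tag{54}$$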

Proof: see Appendix A. 
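
As a quick sanity check on the form of (51), the sketch below integrates the density numerically and confirms that it has unit area. It is not part of the proof. The example assumes one axis of a normalized-energy 16-QAM constellation ($p = 2$, $d_p = 1/\sqrt{10}$), arbitrary LLR values, and a hypothetical assignment of the magnitude bit to the two levels; the helper names are illustrative only.

```python
import math


def pdf_u(u, thetas, beta, llr_q, d_p, es, sigma2):
    """Density (51): 2*beta/sqrt(2*pi*sigma2) * F_{k,q}(u) * exp(-u^2 / (2*sigma2))."""
    rho = es / (2.0 * sigma2)
    f = 0.0
    for i, theta in enumerate(thetas, start=1):
        level = (2 * i - 1) * d_p
        f += (theta * math.exp(-rho * level ** 2)
              * math.cosh(math.sqrt(es) * level * u / sigma2 + llr_q / 2.0))
    gauss = math.exp(-u * u / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)
    return 2.0 * beta * f * gauss


def integrate(func, lo, hi, steps=20000):
    """Plain trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for n in range(1, steps):
        total += func(lo + n * h)
    return total * h


# Hypothetical example: one magnitude bit (LLR l_mag) and one sign bit (LLR l_sign).
l_mag, l_sign = 0.7, -1.2
# Assumed labelling: magnitude bit equal to 1 for i = 1 and 0 for i = 2.
thetas = [math.exp(l_mag / 2.0), math.exp(-l_mag / 2.0)]
beta = 1.0 / (2.0 * math.cosh(l_mag / 2.0) * 2.0 * math.cosh(l_sign / 2.0))
d_p, es, sigma2 = 1.0 / math.sqrt(10.0), 1.0, 0.05

area = integrate(lambda u: pdf_u(u, thetas, beta, l_sign, d_p, es, sigma2), -4.0, 4.0)
print(area)  # should be very close to 1
```

The unit area follows because (51) is a mixture of Gaussians centred at $\pm\sqrt{E_s}[2i-1]d_p$, with weights equal to the a priori probabilities of the corresponding in-phase levels.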

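As seen from (51) and (52), the two RVs $u_k(\tau)$ and $v_k(\tau)$ are almost identically distributed (i.e., their pdfs
have the same structure, but they are parametrized differently). Therefore, when evaluating the required
expectation with respect to either $u_k(\tau)$ or $v_k(\tau)$, equivalent derivation steps can be followed to find
either $\gamma_{k,2p}(\tau)$ or $\gamma_{k,2p-1}(\tau)$. As such, we will only derive $\gamma_{k,2p}(\tau)$ and later deduce the expression of
$\gamma_{k,2p-1}(\tau)$ by easy identification. To that end, we denote the first and second derivatives of $F_{k,2p}(x)$ in
(44), with respect to the working variable $x$, by $F'_{k,2p}(x)$ and $F''_{k,2p}(x)$, respectively. We therefore establish
the second partial derivative of $\ln\big(F_{k,2p}(u_k(\tau))\big)$ with respect to the time delay parameter, $\tau$, as follows:

$$\frac{\partial^2}{\partial \tau^2} \ln\Big(F_{k,2p}\big(u_k(\tau)\big)\Big) = \dot{u}_k(\tau)^2 \left[ \frac{F''_{k,2p}\big(u_k(\tau)\big)}{F_{k,2p}\big(u_k(\tau)\big)} - \left( \frac{F'_{k,2p}\big(u_k(\tau)\big)}{F_{k,2p}\big(u_k(\tau)\big)} \right)^2 \right] + \ddot{u}_k(\tau)\, \frac{F'_{k,2p}\big(u_k(\tau)\big)}{F_{k,2p}\big(u_k(\tau)\big)},$$

in which $\dot{u}_k(\tau) = \partial u_k(\tau)/\partial \tau$ and $\ddot{u}_k(\tau) = \partial^2 u_k(\tau)/\partial \tau^2$. We further show in Appendix A that $\dot{u}_k(\tau)$
and $u_k(\tau)$ are two independent RVs as well. Thus, by applying the expectation operator to the previous
equation, we obtain $\gamma_{k,2p}(\tau)$ as follows:

$$\gamma_{k,2p}(\tau) = E\left\{ \dot{u}_k(\tau)^2 \right\} \left( E\left\{ \left( \frac{F'_{k,2p}\big(u_k(\tau)\big)}{F_{k,2p}\big(u_k(\tau)\big)} \right)^2 \right\} - E\left\{ \frac{F''_{k,2p}\big(u_k(\tau)\big)}{F_{k,2p}\big(u_k(\tau)\big)} \right\} \right) - E\left\{ \ddot{u}_k(\tau)\, \frac{F'_{k,2p}\big(u_k(\tau)\big)}{F_{k,2p}\big(u_k(\tau)\big)} \right\}. \tag{55}$$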

In the sequel, we will derive the analytical expressions for the four expectations involved in (55) separately.
For convenience, we define beforehand the following two quantities (for $q = 2p$ and $2p - 1$) that will
appear repeatedly in the obtained expressions:

$$\omega_{k,q} \triangleq 2\beta_{k,q} \cosh\left( \frac{L_q(k)}{2} \right) \sum_{i=1}^{2^{p-1}} \theta_{k,q}(i)\, d_p^2\, (2i - 1)^2, \tag{56}$$

$$\alpha_{k,q} \triangleq 2\beta_{k,q} \sinh\left( \frac{L_q(k)}{2} \right) \sum_{i=1}^{2^{p-1}} \theta_{k,q}(i)\, d_p\, (2i - 1). \tag{57}$$

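A. Derivation of $E\{\dot{u}_k^2(\tau)\}$

In Appendix A, we show that:

$$\dot{u}_k(\tau) = \sqrt{E_s} \sum_{k'=1}^{K} \Re\{a(k')\}\, \dot{g}\big([k - k']T\big) + \Re\{\dot{w}_k(\tau)\}, \tag{58}$$

where

$$\dot{w}_k(\tau) = -\int_{-\infty}^{+\infty} w(t)\, \dot{h}(t - kT - \tau)\, dt. \tag{59}$$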

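Recall here that the transmitted symbols are assumed mutually independent. As they are also independent
from the derivative noise components and exploiting the fact that $E\{\dot{w}_k(\tau)\} = 0$ (since $E\{w(t)\} = 0$),
it can be shown that $E\{\dot{u}_k^2(\tau)\}$ is given by:

$$E\left\{ \dot{u}_k(\tau)^2 \right\} = E_s \left[ \sum_{l=1}^{K} E\left\{ \Re\{a(l)\}^2 \right\} \dot{g}\big([k - l]T\big)^2 + \sum_{l=1}^{K} \sum_{\substack{n=1 \\ n \neq l}}^{K} E\big\{ \Re\{a(l)\} \big\} E\big\{ \Re\{a(n)\} \big\}\, \dot{g}\big([k - l]T\big)\, \dot{g}\big([k - n]T\big) \right] + E\left\{ \Re\{\dot{w}_k(\tau)\}^2 \right\}. \tag{60}$$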


The expected values of $\Re\{a(k)\}$ and $\Re\{a(k)\}^2$ involved in (60) are obtained by averaging them over all
the points in the constellation alphabet, $\mathcal{C}_p$, i.e.:

$$E\left\{ \Re\{a(k)\}^2 \right\} = \sum_{c_m \in \mathcal{C}_p} P[a(k) = c_m]\, \Re\{c_m\}^2, \tag{61}$$

$$E\big\{ \Re\{a(k)\} \big\} = \sum_{c_m \in \mathcal{C}_p} P[a(k) = c_m]\, \Re\{c_m\}. \tag{62}$$

Starting from (61) and resorting to some algebraic manipulations, we show in Appendix B that:

$$E\left\{ \Re\{a(k)\}^2 \right\} = \omega_{k,2p}. \tag{63}$$

Using equivalent derivations, it can be also shown that:

$$E\big\{ \Re\{a(k)\} \big\} = \alpha_{k,2p}. \tag{64}$$
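
The identities (63) and (64) can be cross-checked numerically. The sketch below compares the closed forms (56) and (57) against a brute-force average over the levels of one constellation axis. It assumes independent bits, a sign bit equal to 1 for positive levels, and a hypothetical ordering of the magnitude-bit patterns along the axis (the actual Gray labelling of [33] may differ, but the two computations agree for any labelling used consistently in both). All function names are illustrative.

```python
import math
from itertools import product


def bit_prob(llr, bit):
    """P[b = bit] from its a priori LLR, as in (3)."""
    return math.exp((2 * bit - 1) * llr / 2.0) / (2.0 * math.cosh(llr / 2.0))


def thetas_from_llrs(l_mags):
    """theta(i) = prod_l exp((2*b_l - 1) * L_l / 2), one value per magnitude-bit pattern."""
    out = []
    for bits in product((0, 1), repeat=len(l_mags)):
        out.append(math.exp(sum((2 * b - 1) * l / 2.0 for b, l in zip(bits, l_mags))))
    return out


def closed_form(l_mags, l_sign, d_p):
    """Return (alpha, omega) computed from (57) and (56)."""
    thetas = thetas_from_llrs(l_mags)
    beta = 1.0 / (2.0 * math.cosh(l_sign / 2.0))
    for l in l_mags:
        beta /= 2.0 * math.cosh(l / 2.0)
    s1 = sum(t * d_p * (2 * i - 1) for i, t in enumerate(thetas, start=1))
    s2 = sum(t * (d_p * (2 * i - 1)) ** 2 for i, t in enumerate(thetas, start=1))
    alpha = 2.0 * beta * math.sinh(l_sign / 2.0) * s1
    omega = 2.0 * beta * math.cosh(l_sign / 2.0) * s2
    return alpha, omega


def direct(l_mags, l_sign, d_p):
    """Return (E{Re a}, E{Re a^2}) by enumerating every level of the axis."""
    mean = power = 0.0
    for i, bits in enumerate(product((0, 1), repeat=len(l_mags)), start=1):
        p_mag = 1.0
        for b, l in zip(bits, l_mags):
            p_mag *= bit_prob(l, b)
        for sign_bit, sign in ((1, 1.0), (0, -1.0)):
            p = p_mag * bit_prob(l_sign, sign_bit)
            level = sign * (2 * i - 1) * d_p
            mean += p * level
            power += p * level ** 2
    return mean, power


if __name__ == "__main__":
    args = ([0.7], -1.2, 1.0 / math.sqrt(10.0))   # hypothetical 16-QAM axis
    print(closed_form(*args))
    print(direct(*args))
```

With all LLRs set to zero, $\alpha_{k,2p}$ vanishes and $\omega_{k,2p}$ reduces to half the average symbol energy, i.e., the uncoded (NDA) case.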

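In order to find the noise contribution through the derivative term in (60), we recall that the original
continuous-time noise is assumed to be white, i.e., $E\big\{ \Re\{w(t_1)\}\, \Re\{w(t_2)\} \big\} = \sigma^2 \delta(t_1 - t_2)$. Therefore,
starting from the expression of $\dot{w}_k(\tau)$ in (59) and resorting to equivalent manipulations as in (109) of
Appendix A, we obtain:

$$\begin{aligned}
E\left\{ \Re\{\dot{w}_k(\tau)\}^2 \right\} &= \sigma^2 \int_{\mathbb{R}} \dot{h}(t - kT - \tau)^2\, dt, \\
&= -\sigma^2 \int_{\mathbb{R}} h(t - kT - \tau)\, \ddot{h}(t - kT - \tau)\, dt, \\
&= -\sigma^2\, \ddot{g}(0).
\end{aligned} \tag{65}$$

Note here that, in line with the left-hand side of (65), the right-hand side of the same equation is indeed
positive since $\ddot{g}(0) < 0$. This is because the filter $g(.)$ is concave in the vicinity of zero where it also
attains its maximum. Now, using (63) to (65) in (60), it can be easily shown that:

$$E\left\{ \dot{u}_k(\tau)^2 \right\} = E_s \sum_{l=1}^{K} \left( \omega_{l,2p} - \alpha_{l,2p}^2 \right) \dot{g}^2\big([l - k]T\big) + E_s \left( \sum_{l=1}^{K} \alpha_{l,2p}\, \dot{g}\big([l - k]T\big) \right)^2 - \sigma^2\, \ddot{g}(0). \tag{66}$$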




B. Derivation of $\mathrm{E}\{(F'_{k,2p}(u_k(\tau))/F_{k,2p}(u_k(\tau)))^2\}$

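This is nothing but the expected value of a known transformation of the RV, $u_k(\tau)$, whose distribution was already established in (51). Therefore, it can be evaluated in closed form by integration over $p[u_k(\tau)]$ as follows:

$$\mathrm{E}\left\{\left(\frac{F'_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\right)^{2}\right\} = \int_{-\infty}^{+\infty}\left(\frac{F'_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\right)^{2} p[u_k(\tau)]\,du_k(\tau) = \frac{2\beta_{k,2p}}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{+\infty}\frac{F'_{k,2p}(u_k(\tau))^2}{F_{k,2p}(u_k(\tau))}\,e^{-\frac{u_k^2(\tau)}{2\sigma^2}}\,du_k(\tau).$$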


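After using the explicit expression of $F_{k,2p}(\cdot)$, the last equality is further simplified by using the variable substitution $t = \sqrt{2}\,u_k(\tau)/\sigma$ to obtain:

$$\mathrm{E}\left\{\left(\frac{F'_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\right)^{2}\right\} = \frac{2\rho}{\sigma^2}\,\Psi_{k,2p}(\rho), \tag{67}$$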
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where '^k, 2 p{-) in the last equality is given by: 


with 


e ‘idt, 

J-oo Ok,2p{t, p) 


2P-1 . 

h, 2 p{t,p) = X sinh \yp[2i-l]dpt + 

i=l ^ 


( 68 ) 


L2p{k) 

2 n 


2V- 


Sk,2p{t,p) = '^Ok,2p{i)e ^'"'^"^cosh(y^[2z-l]cipt + 


i=l 


L2p{k) 


C. Derivation of $\mathrm{E}\{F''_{k,2p}(u_k(\tau))/F_{k,2p}(u_k(\tau))\}$

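This expectation can also be explicitly found by integrating over $p[u_k(\tau)]$ in (51) to yield:

$$\mathrm{E}\left\{\frac{F''_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\right\} = \int_{-\infty}^{+\infty}\frac{F''_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\,p[u_k(\tau)]\,du_k(\tau) = \frac{2\beta_{k,2p}}{\sqrt{2\pi\sigma^2}}\int_{-\infty}^{+\infty}F''_{k,2p}(u_k(\tau))\,e^{-\frac{u_k^2(\tau)}{2\sigma^2}}\,du_k(\tau), \tag{69}$$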


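in which the second derivative of the function $F_{k,2p}(\cdot)$ defined in (44) is given by:

$$F''_{k,2p}(x) = \frac{E_s d_p^2}{\sigma^4}\sum_{i=1}^{2^{p-1}}(2i-1)^2\,\theta_{k,2p}(i)\,e^{-\rho(2i-1)^2 d_p^2}\cosh\!\left(\frac{\sqrt{E_s}\,[2i-1]\,d_p}{\sigma^2}\,x + \frac{L_{2p}(k)}{2}\right). \tag{70}$$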


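After expanding (70) using the identity $\cosh(x + y) = \cosh(x)\cosh(y) + \sinh(x)\sinh(y)$, plugging the result back into (69) and then using the fact that $\sinh(x)\,e^{-x^2/(2\sigma^2)}$ is an odd function (i.e., its integral is identically zero), it follows that (69) is explicitly given by:

$$\mathrm{E}\left\{\frac{F''_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\right\} = \frac{2\beta_{k,2p}}{\sqrt{2\pi\sigma^2}}\,\frac{E_s d_p^2}{\sigma^4}\cosh\!\left(\frac{L_{2p}(k)}{2}\right)\sum_{i=1}^{2^{p-1}}(2i-1)^2\,\theta_{k,2p}(i)\,e^{-\rho(2i-1)^2 d_p^2}\int_{-\infty}^{+\infty}\cosh\!\left(\frac{\sqrt{E_s}\,[2i-1]\,d_p}{\sigma^2}\,u_k(\tau)\right)e^{-\frac{u_k^2(\tau)}{2\sigma^2}}\,du_k(\tau). \tag{71}$$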


Moreover, we show via integration by parts the following equality for any $a > 0$ and $b \in \mathbb{R}$:

$$\int_{-\infty}^{+\infty}\cosh(bx)\,e^{-ax^2}\,dx = \sqrt{\frac{\pi}{a}}\;e^{\frac{b^2}{4a}}, \tag{72}$$





















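which is used in (71), with the appropriate identifications, to yield the following closed-form expression for the expectation in (69):

$$\mathrm{E}\left\{\frac{F''_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\right\} = \frac{2\rho}{\sigma^2}\,\omega_{k,2p}. \tag{73}$$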
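The Gaussian-weighted cosh integral that leads from (71) to (73) can be sanity-checked numerically. The sketch below assumes the identity takes the standard form $\int_{-\infty}^{+\infty}\cosh(bx)e^{-ax^2}dx = \sqrt{\pi/a}\,e^{b^2/(4a)}$; the function names, the truncation rule, and the test values of $a$ and $b$ are illustrative assumptions, not taken from the paper.

```python
import math


def cosh_gauss_closed_form(a, b):
    """Closed form sqrt(pi/a) * exp(b^2 / (4a)), valid for a > 0."""
    return math.sqrt(math.pi / a) * math.exp(b * b / (4.0 * a))


def cosh_gauss_numeric(a, b, steps=20000):
    """Trapezoidal approximation of the integral of cosh(b x) exp(-a x^2).

    The infinite range is truncated to [-lim, lim], where lim extends
    10 Gaussian widths beyond the integrand peaks at x = +/- |b| / (2a).
    Intended for moderate |b| * lim only (math.cosh overflows otherwise).
    """
    lim = abs(b) / (2.0 * a) + 10.0 / math.sqrt(a)
    h = 2.0 * lim / steps

    def integrand(x):
        return math.cosh(b * x) * math.exp(-a * x * x)

    total = 0.5 * (integrand(-lim) + integrand(lim))
    for n in range(1, steps):
        total += integrand(-lim + n * h)
    return total * h


# Assumed test values; any a > 0 and real b should agree.
numeric_value = cosh_gauss_numeric(1.5, 2.0)
closed_value = cosh_gauss_closed_form(1.5, 2.0)
```

Under the same assumption, taking $a = 1/(2\sigma^2)$ and $b = \sqrt{E_s}\,[2i-1]\,d_p/\sigma^2$ in the closed form produces the factor $\sqrt{2\pi\sigma^2}\,e^{\rho(2i-1)^2d_p^2}$ with $\rho = E_s/(2\sigma^2)$, which cancels the exponential weights inside the sum in (71).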


D. Derivation of $\mathrm{E}\{\ddot{u}_k(\tau)\,F'_{k,2p}(u_k(\tau))/F_{k,2p}(u_k(\tau))\}$


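To find this expectation, we use a standard approach in which we first find the expectation conditioned on $u_k(\tau)$ and then average the obtained result with respect to $u_k(\tau)$. By doing so, we obtain:

$$\mathrm{E}\left\{\ddot{u}_k(\tau)\frac{F'_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\right\} = \mathrm{E}_{u_k}\left\{\mathrm{E}\{\ddot{u}_k(\tau)\,|\,u_k(\tau)\}\,\frac{F'_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\right\}. \tag{74}$$

In order to find $\mathrm{E}\{\ddot{u}_k(\tau)\,|\,u_k(\tau)\}$ in (74), we must find the explicit expression of $\ddot{u}_k(\tau)$ as a function of $u_k(\tau)$. In fact, it is easy to show that:

$$\ddot{u}_k(\tau) = \Re\left\{\int_{-\infty}^{+\infty} y(t)\,\ddot{h}(t-kT-\tau)\,dt\right\} = \sqrt{E_s}\sum_{l=1}^{K}\Re\{a(l)\}\,\ddot{g}([l-k]T) + \Re\{\ddot{w}_k(\tau)\}. \tag{75}$$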

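Moreover, from (100) and (102) in Appendix A, we readily have:

$$u_l(\tau) = \sqrt{E_s}\,\Re\{a(l)\} + \Re\{w_l(\tau)\}. \tag{76}$$

Therefore, $\Re\{a(l)\} = \frac{1}{\sqrt{E_s}}\left[u_l(\tau) - \Re\{w_l(\tau)\}\right]$, which is used in (75) to obtain:

$$\ddot{u}_k(\tau) = \sum_{l=1}^{K}\left[u_l(\tau) - \Re\{w_l(\tau)\}\right]\ddot{g}([l-k]T) + \Re\{\ddot{w}_k(\tau)\}. \tag{77}$$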


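Now, since $\mathrm{E}\{\Re\{w_k(\tau)\}\} = \mathrm{E}\{\Re\{\ddot{w}_k(\tau)\}\} = 0$ and since the RVs $\{u_l(\tau)\}_l$ are mutually independent, it follows that:

$$\mathrm{E}\{\ddot{u}_k(\tau)\,|\,u_k(\tau)\} = u_k(\tau)\,\ddot{g}(0) + \sum_{\substack{l=1\\ l\neq k}}^{K}\mathrm{E}\{u_l(\tau)\}\,\ddot{g}([l-k]T). \tag{78}$$

But owing to (76) and (64), it immediately follows that:

$$\mathrm{E}\{u_l(\tau)\} = \sqrt{E_s}\,\mathrm{E}\{\Re\{a(l)\}\} = \sqrt{E_s}\,\alpha_{l,2p}. \tag{79}$$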
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Using (79) in (78) and then plugging the obtained result back into (74), we obtain: 


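$$\mathrm{E}\left\{\ddot{u}_k(\tau)\frac{F'_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\right\} = \ddot{g}(0)\,\mathrm{E}\left\{u_k(\tau)\frac{F'_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\right\} + \sqrt{E_s}\left(\sum_{\substack{l=1\\ l\neq k}}^{K}\alpha_{l,2p}\,\ddot{g}([l-k]T)\right)\mathrm{E}\left\{\frac{F'_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\right\}. \tag{80}$$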

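As done previously, the two expectations in (80) are derived in closed form by integration over the distribution, $p[u_k(\tau)]$, already established in (51). The final results are given by:

$$\mathrm{E}\left\{u_k(\tau)\frac{F'_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\right\} = 2\rho\,\omega_{k,2p}, \tag{81}$$

$$\mathrm{E}\left\{\frac{F'_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\right\} = \frac{\sqrt{E_s}}{\sigma^2}\,\alpha_{k,2p}, \tag{82}$$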


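which are used in (80) to yield:

$$\mathrm{E}\left\{\ddot{u}_k(\tau)\frac{F'_{k,2p}(u_k(\tau))}{F_{k,2p}(u_k(\tau))}\right\} = 2\rho\left[\omega_{k,2p}\,\ddot{g}(0) + \alpha_{k,2p}\sum_{\substack{l=1\\ l\neq k}}^{K}\alpha_{l,2p}\,\ddot{g}([l-k]T)\right]. \tag{83}$$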

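Finally, by injecting (66), (67), (73), and (83) back into (55), the analytical expression of $\gamma_{k,2p}(\tau)$ is obtained as:

$$\gamma_{k,2p}(\tau) = -4\rho^2\left(\omega_{k,2p} - \Psi_{k,2p}(\rho)\right)\left[\sum_{l=1}^{K}\left(\omega_{l,2p} - \alpha_{l,2p}^2\right)\dot{g}^2([l-k]T) + \left(\sum_{l=1}^{K}\alpha_{l,2p}\,\dot{g}([l-k]T)\right)^{2}\right] - 2\rho\left[\Psi_{k,2p}(\rho)\,\ddot{g}(0) + \alpha_{k,2p}\sum_{\substack{l=1\\ l\neq k}}^{K}\alpha_{l,2p}\,\ddot{g}([l-k]T)\right]. \tag{84}$$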


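Due to the apparent symmetries between the distributions of the two RVs $u_k(\tau)$ and $v_k(\tau)$, the analytical expression of $\gamma_{k,2p-1}(\tau)$ can be directly deduced from that of $\gamma_{k,2p}(\tau)$ by easy identifications as:

$$\gamma_{k,2p-1}(\tau) = -4\rho^2\left(\omega_{k,2p-1} - \Psi_{k,2p-1}(\rho)\right)\left[\sum_{l=1}^{K}\left(\omega_{l,2p-1} - \alpha_{l,2p-1}^2\right)\dot{g}^2([l-k]T) + \left(\sum_{l=1}^{K}\alpha_{l,2p-1}\,\dot{g}([l-k]T)\right)^{2}\right] - 2\rho\left[\Psi_{k,2p-1}(\rho)\,\ddot{g}(0) + \alpha_{k,2p-1}\sum_{\substack{l=1\\ l\neq k}}^{K}\alpha_{l,2p-1}\,\ddot{g}([l-k]T)\right]. \tag{85}$$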


The closed-form expression for the TD CA CRLB is then obtained as the inverse of the Fisher information given by (48), i.e.:

$$\mathrm{CRLB}(\tau) = \frac{1}{\sum_{k=1}^{K}\left[\gamma_{k,2p}(\tau) + \gamma_{k,2p-1}(\tau)\right]}. \tag{86}$$

It is worth mentioning here that the turbo-code setup is not needed in our derivations and that the new CA CRLB expression (86) is actually valid for any coded system in general. In fact, we have so far only exploited the fact that the constellation is Gray-coded and we have expressed the CA TD CRLBs explicitly in terms of the coded bits' a priori LLRs. We will explain in the next subsection how these unknown LLRs are obtained from the output of the SISO decoders in a turbo-coded system. They can also be obtained from LDPC-coded systems in the very same way if the latter are decoded with the turbo principle [37], [38] (i.e., MAP or BCJR decoder). In this case, the so-called check nodes (C-nodes) and variable nodes (V-nodes) [37] play the very same role as SISO decoders in turbo-coded systems.

E. Evaluation of the analytical CA CRLBs 

In order to compute and plot the new CA CRLBs, one needs to evaluate the coefficients $\omega_{k,q}$ and $\alpha_{k,q}$ for $q = 2p$ and $q = 2p - 1$. These coefficients are, however, functions of the a priori LLRs, $L_l(k)$, as seen from (56) and (57). In the sequel, we briefly explain how these LLRs can be obtained from the output of the SISO decoders at the convergence of the BCJR algorithm. First, the MF returns a sequence of $K$ symbol-rate samples:

$$\mathbf{y}(\tau) = [y_1(\tau), y_2(\tau), \ldots, y_K(\tau)], \tag{87}$$


where (cf. Appendix A):

$$y_k(\tau) = \int_{-\infty}^{+\infty} y(t)\,h(t-kT-\tau)\,dt = \sqrt{E_s}\,a(k) + w_k(\tau). \tag{88}$$


Then, the soft demapper extracts the so-called bit likelihoods:

$$\Lambda_l(k) = \ln\!\left(\frac{p[\mathbf{y}(\tau)\,|\,b_l^k = 1]}{p[\mathbf{y}(\tau)\,|\,b_l^k = 0]}\right), \tag{89}$$

for all the code bits and feeds them as inputs to the turbo decoder. By exchanging the so-called extrinsic information between the two SISO decoders, the a posteriori LLRs of the code bits:

$$\Upsilon_l(k) = \ln\!\left(\frac{P[b_l^k = 1\,|\,\mathbf{y}(\tau)]}{P[b_l^k = 0\,|\,\mathbf{y}(\tau)]}\right), \tag{90}$$

are updated iteratively according to the turbo principle. We denote their values at the $r$th turbo iteration as $\Upsilon_l^{(r)}(k)$. After, say, $R$ turbo iterations, a steady state is achieved wherein $\Upsilon_l^{(R)}(k) \approx \Upsilon_l(k)$, for every $l$ and $k$, and their signs are used to detect the bits. Yet, owing to the well-known Bayes' formula, we have:


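$$P[b_l^k = 1\,|\,\mathbf{y}(\tau)] = \frac{p[\mathbf{y}(\tau)\,|\,b_l^k = 1]\,P[b_l^k = 1]}{p[\mathbf{y}(\tau)]}, \tag{91}$$

and

$$P[b_l^k = 0\,|\,\mathbf{y}(\tau)] = \frac{p[\mathbf{y}(\tau)\,|\,b_l^k = 0]\,P[b_l^k = 0]}{p[\mathbf{y}(\tau)]}. \tag{92}$$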


Therefore, by taking the ratio of (91) and (92) and applying the natural logarithm, it immediately follows that:

$$L_l(k) = \Upsilon_l(k) - \Lambda_l(k) \approx \Upsilon_l^{(R)}(k) - \Lambda_l(k), \tag{93}$$

meaning that the required a priori LLRs of the code bits can be easily obtained from their steady-state a posteriori LLRs and the bit likelihoods, $\Lambda_l(k)$, already computed by the soft demapper prior to data decoding.
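The relation in (93) is a per-bit subtraction, which the following minimal Python sketch illustrates on a single bit. The function names and the numerical values (a prior $P[b=1] = 0.8$ and two likelihood values) are hypothetical and chosen only to show that Bayes' formula makes the subtraction recover the prior LLR; a real receiver would take the two inputs from the SISO decoders and the soft demapper.

```python
import math


def a_priori_llrs(a_posteriori_llrs, bit_likelihoods):
    """A priori LLRs as a posteriori LLRs minus demapper bit likelihoods."""
    if len(a_posteriori_llrs) != len(bit_likelihoods):
        raise ValueError("inputs must have the same length")
    return [ups - lam for ups, lam in zip(a_posteriori_llrs, bit_likelihoods)]


def posterior_llr(prior_one, lik_one, lik_zero):
    """A posteriori LLR of one bit via Bayes' formula (evidence cancels)."""
    return math.log((lik_one * prior_one) / (lik_zero * (1.0 - prior_one)))


# Assumed toy values for one code bit.
prior_one = 0.8                    # P[b = 1]
lik_one, lik_zero = 0.3, 0.1       # p[y | b = 1], p[y | b = 0]
upsilon = posterior_llr(prior_one, lik_one, lik_zero)
lam = math.log(lik_one / lik_zero)
recovered = a_priori_llrs([upsilon], [lam])[0]
```

Here `recovered` equals $\ln(0.8/0.2)$ up to rounding, i.e., the LLR of the prior that was fed in.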


V. New Time Delay CA ML estimator 

As mentioned previously, the timing recovery task is integrated within the turbo iteration loop. But in order to initiate the turbo decoding process itself, the latter needs some preliminary information-bearing symbol-rate samples. These can be obtained at the output of the MF (corrected with $\hat{\tau}_{\text{ML-NDA}}$), where $\hat{\tau}_{\text{ML-NDA}}$ is the NDA MLE for the TD parameter estimated as:

$$\hat{\tau}_{\text{ML-NDA}} = \arg\max_{\tau}\;\mathcal{L}^{(0)}(\tau), \tag{94}$$

where $\mathcal{L}^{(0)}(\cdot)$ is the NDA LLF obtained directly from its CA counterpart in (45) by setting $L_l(k) = 0$ for all $l$ and $k$ (see the footnote below), i.e.:

$$\mathcal{L}^{(0)}(\tau) = \sum_{k=1}^{K}\Big[\ln\big(F(u_k(\tau))\big) + \ln\big(F(v_k(\tau))\big)\Big], \tag{95}$$

Footnote: In the NDA case (i.e., before starting data decoding), no a priori information about the bits is available at the receiver end, i.e., $P[b_l^k = 0] = P[b_l^k = 1] = 1/2$ and thus $L_l(k) = 0$ for all $l$ and $k$.
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in which F{.) is simply given by: 

2P-1 

i=l 

The iterative algorithm that maximizes $\mathcal{L}^{(0)}(\tau)$ with respect to $\tau$ in (94) will be detailed at the end of this section. Note also that $u_k(\tau)$ and $v_k(\tau)$ involved in (95) are the real and imaginary parts of a discrete-time MF output that is obtained as follows. At the receiver side, $y(t)$ is oversampled using a sampling period $T_s < T/(1+\beta)$, with $\beta$ being the roll-off factor, to obtain:

$$y_l \triangleq y(lT_s) = \sqrt{E_s}\sum_{k=1}^{K} a(k)\,h(lT_s - kT - \tau) + w(lT_s).$$

These high-rate samples are then passed through a discrete-time MF to obtain the symbol-rate samples:

$$y_k(\tau) = \sum_{l} y_l\,h(lT_s - kT - \tau),$$

from which we obtain $u_k(\tau) = \Re\{y_k(\tau)\}$ and $v_k(\tau) = \Im\{y_k(\tau)\}$, which are used in (95). Once $\hat{\tau}_{\text{ML-NDA}}$ is acquired, the corresponding sequence of symbol-rate samples:

$$\mathbf{y}(\hat{\tau}_{\text{ML-NDA}}) = [y_1(\hat{\tau}_{\text{ML-NDA}}), y_2(\hat{\tau}_{\text{ML-NDA}}), \ldots, y_K(\hat{\tau}_{\text{ML-NDA}})],$$

is passed to the soft demapper in order to find the bit likelihoods required to start the decoding process. To exploit the output of the decoder and better re-synchronize the system, on a per-turbo-iteration basis, we modify (93) as follows:

$$L_l^{(r)}(k) = \Upsilon_l^{(r)}(k) - \Lambda_l^{(r-1)}(k), \tag{96}$$

in order to obtain a more refined TD estimate, $\hat{\tau}^{(r)}_{\text{ML-CA}}$, after each turbo iteration as will be explained shortly. Note here that $\Lambda_l^{(r-1)}(k)$ are the bit likelihoods that are obtained after re-synchronizing the system with $\hat{\tau}^{(r-1)}_{\text{ML-CA}}$, the TD estimate corresponding to the previous turbo iteration. These are fed to the SISO decoders to compute an update for the a posteriori LLRs, $\Upsilon_l^{(r)}(k)$, at the current turbo iteration. The refined TD MLE is thereby obtained as:

$$\hat{\tau}^{(r)}_{\text{ML-CA}} = \arg\max_{\tau}\;\mathcal{L}^{(r)}(\tau),$$


-piV.d2[2i_i]2 ^2S\2i-2h/N^ 


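(97)

where $\mathcal{L}^{(r)}(\tau)$ is the CA LLF in (45) evaluated using $L_l^{(r)}(k)$ instead of $L_l(k)$, i.e.:

$$\mathcal{L}^{(r)}(\tau) = \sum_{k=1}^{K}\Big[\ln\big(F^{(r)}_{k,2p}(u_k(\tau))\big) + \ln\big(F^{(r)}_{k,2p-1}(v_k(\tau))\big)\Big],$$

in which $F^{(r)}_{k,q}(\cdot)$ is given by:

$$F^{(r)}_{k,q}(x) = \sum_{i=1}^{2^{p-1}}\theta^{(r)}_{k,q}(i)\,e^{-\rho(2i-1)^2 d_p^2}\cosh\!\left(\frac{\sqrt{E_s}\,[2i-1]\,d_p}{\sigma^2}\,x + \frac{L_q^{(r)}(k)}{2}\right),$$

for $q = 2p$ and $2p - 1$. Here, $\theta^{(r)}_{k,2p}(i)$ and $\theta^{(r)}_{k,2p-1}(i)$ are also obtained by using $L_l^{(r)}(k)$ instead of $L_l(k)$ in (41) and (42), respectively.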

A key detail that still needs to be addressed is how the NDA and CA LLFs are maximized in (94) and (97). Since these LLFs were derived in closed form, they can be easily maximized using any of the popular iterative techniques such as the well-known Newton-Raphson algorithm:

$$\tau^{(r)}_{n} = \tau^{(r)}_{n-1} - \left[\frac{\partial^2\mathcal{L}^{(r)}(\tau)}{\partial\tau^2}\right]^{-1}\frac{\partial\mathcal{L}^{(r)}(\tau)}{\partial\tau}\Bigg|_{\tau = \tau^{(r)}_{n-1}}, \tag{98}$$

in which $\tau^{(r)}_{n}$ is the TD update pertaining to the $n$th Newton-Raphson iteration. The algorithm stops once the convergence criterion $|\tau^{(r)}_{n} - \tau^{(r)}_{n-1}| < \epsilon$ is met (see the footnote below) to produce $\hat{\tau}^{(r)}_{\text{ML-CA}}$, the CA TD MLE during the $r$th turbo iteration. Note, however, that the Newton-Raphson algorithm is itself iterative in nature and, therefore, requires a reliable initial guess, $\tau^{(r)}_{0}$, to ensure its convergence to the global maximum of the underlying objective LLF. At each turbo iteration, the algorithm is initialized by $\tau^{(r)}_{0} = \hat{\tau}^{(r-1)}_{\text{ML-CA}}$ (i.e., the TD MLE pertaining to the previous turbo iteration). At the very first turbo iteration, however, the algorithm is initialized with the NDA MLE, $\hat{\tau}_{\text{ML-NDA}}$, obtained in (94). The latter is obtained by maximizing $\mathcal{L}^{(0)}(\tau)$ itself via the very same Newton-Raphson algorithm, and the corresponding initial guess is obtained by a broad line search over $\tau$. For better illustration, Fig. 1 depicts the architecture of the newly proposed CA ML timing recovery algorithm.

Footnote: Note here that $\epsilon$ is a predefined threshold that governs the required estimation accuracy.


Fig. 1. Flowchart of the new CA TD ML estimator. 
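The maximization strategy described in this section (a broad line search for the first initial guess, followed by Newton-Raphson refinement with a stopping threshold) can be sketched as follows. This is only a sketch under stated assumptions: the derivatives are approximated by central finite differences rather than the closed-form derivatives of the LLF, the function names are hypothetical, and the cosine objective merely stands in for an LLF with a single peak.

```python
import math


def coarse_line_search(llf, tau_min, tau_max, num_points=21):
    """Return the grid point that maximizes llf over [tau_min, tau_max]."""
    step = (tau_max - tau_min) / (num_points - 1)
    grid = [tau_min + i * step for i in range(num_points)]
    return max(grid, key=llf)


def newton_raphson_maximize(llf, tau0, eps=1e-8, h=1e-4, max_iter=50):
    """Refine tau0 by Newton-Raphson steps on a scalar objective.

    First and second derivatives use central finite differences of width h
    (an assumption of this sketch). Iterations stop when two successive
    updates differ by less than eps.
    """
    tau = tau0
    for _ in range(max_iter):
        d1 = (llf(tau + h) - llf(tau - h)) / (2.0 * h)
        d2 = (llf(tau + h) - 2.0 * llf(tau) + llf(tau - h)) / (h * h)
        if d2 >= 0.0:
            # Not locally concave: a Newton step would not head to a maximum.
            break
        tau_next = tau - d1 / d2
        if abs(tau_next - tau) < eps:
            return tau_next
        tau = tau_next
    return tau


def toy_llf(tau):
    """Stand-in objective with a single maximum at tau = 0.12 on [-0.5, 0.5]."""
    return math.cos(2.0 * math.pi * (tau - 0.12))


tau_init = coarse_line_search(toy_llf, -0.5, 0.5)
tau_hat = newton_raphson_maximize(toy_llf, tau_init)
```

In the turbo loop described above, only the very first call would need `coarse_line_search`; later calls would start from the estimate of the previous turbo iteration.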

VI. Simulation Results 

In this section, we provide some graphical representations of the new TD CA CRLBs for different 
modulation orders and different coding rates. We also analyze its computational complexity and compare 
it to that of the existing sum-product expectation-maximization (SP-EM) timing recovery algorithm [25]. 
The encoder is composed of two identical RSCs concatenated in parallel, having generator polynomials 
(1,0,1,1) and (1,1,0,1), and a systematic rate Rq = ^ each. The output of the turbo encoder is punctured 
in order to achieve the desired code rate R. For the tailing bits, the size of the RSC encoders memory is 
fixed to 4. We consider a root-raised-cosine (RRC) signal with roll-off factor a = 0.2. We also consider 
QPSK and 16-QAM, as two representative examples of square-QAM constellations, and two different 
coding rates, namely R = ^ and R = I- 

We begin by verifying in Figs. 2 and 3 that the new analytical CA CRLBs coincide with their empirical counterparts obtained previously in [29] from exhaustive Monte-Carlo simulations. In fact, unlike our closed-form solution, an extremely large number of noisy observations was generated in [29] in order to find an empirical value for the expectation involved in the Fisher information (47). Hence, our new analytical expression corroborates these previous attempts to evaluate the underlying TD CA CRLBs empirically and allows their immediate evaluation for any square-QAM turbo-coded signal.
Fig. 2. Comparison between the empirical and analytical CA CRLBs for different code rates, R, as a function of the SNR: QPSK, rolloff = 0.2.



Fig. 3. Comparison between the empirical and analytical CA CRLBs for different code rates, R, as a function of the SNR: 16-QAM, rolloff = 0.2.

As expected, we also see from both figures that the CA CRLBs are smaller than their NDA counterparts. 
This highlights the performance improvements that can be achieved by a coded system over an uncoded 
one by exploiting the information about the transmitted bits that is obtained from the SISO decoders. 
Additionally and most prominently, the CA CRLBs decrease rapidly and reach the DA CRLBs, which are the best bounds one could ever achieve if all the transmitted symbols were, hypothetically, perfectly known to the receiver.

In the sequel, we also assess the performance of the new TD CA ML estimator using the normalized (by $T^2$) mean square error (NMSE) as a performance measure:

$$\text{NMSE} = \frac{1}{M_c\,T^2}\sum_{m=1}^{M_c}\left(\hat{\tau}^{(m)} - \tau\right)^2, \tag{99}$$

where $\hat{\tau}^{(m)}$ is the estimate of $\tau$ generated from the $m$th Monte-Carlo run for $m = 1, 2, \ldots, M_c$. In Figs. 4 and 5, we plot the NMSE of the new estimator for QPSK and 16-QAM transmissions obtained from $M_c = 5000$ Monte-Carlo trials, and benchmark the resulting performance curves against the corresponding new CA CRLBs. To illustrate the performance advantage brought by CA estimation as compared to NCA estimation (from the algorithmic point of view), we also plot in the same figures the NMSE of the NDA TD ML estimator (94). Figs. 4 and 5 show that the potential estimation performance gains (attributed to the decoder's assistance), now made theoretically predictable by the CA CRLBs, can be achieved in practice by the newly proposed CA ML estimator. More interestingly, the new estimator almost reaches the CA CRLB over the entire practical SNR range, thereby confirming its statistical efficiency.
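The NMSE measure in (99) is straightforward to evaluate from a set of Monte-Carlo estimates. The sketch below assumes the usual definition (mean squared error divided by $T^2$); the function name and the toy estimates are hypothetical and only illustrate the normalization.

```python
def nmse(estimates, true_tau, symbol_period):
    """Mean square error of the delay estimates, normalized by T^2."""
    if not estimates:
        raise ValueError("at least one Monte-Carlo estimate is required")
    squared_errors = [(est - true_tau) ** 2 for est in estimates]
    return sum(squared_errors) / (len(estimates) * symbol_period ** 2)


# Assumed toy data: two runs, each off by 0.1 from the true delay of 0.2.
nmse_unit_period = nmse([0.1, 0.3], true_tau=0.2, symbol_period=1.0)
nmse_double_period = nmse([0.1, 0.3], true_tau=0.2, symbol_period=2.0)
```

Doubling the symbol period divides the NMSE by four, which is why the normalization makes results comparable across symbol rates.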





Fig. 4. NMSE of the new CA ML estimator for different code rates, R, as function of the SNR: QPSK, rolloff = 0.2. 


In the same figures, we can also observe unambiguously the effect of the coding rate, R, on CA estimation 
performance. Even though the same NMSE levels are achieved at relatively high SNRs for i? = ^ and 
Fig. 5. NMSE of the new CA ML estimator for different code rates, R, as a function of the SNR: 16-QAM, rolloff = 0.2.

R = ^, the estimator performs quite differently for the two rates at the same SNR values. In fact, with 
smaller coding rates, more redundancy is introduced by the encoder and, hence, the decoder becomes 
more likely able to correctly detect the transmitted bits, thereby enhancing the estimation performance. 
Now, if we turn the tables and assess the effect of modulation order on estimation performance at the 
same coding rate, we observe without any surprise that it deteriorates with larger constellations at any 
given SNR level. This typical behavior was already observed in NDA estimation and, as a matter of fact, 
in any parameter estimation problem involving linearly-modulated signals. Indeed, when the modulation 
order increases, the inter-symbol distance decreases for normalized-energy constellations. As such, at the 
same SNR level, noise components have a relatively worse impact on symbol detection and parameter 
estimation in general. 

Finally, we compare the new CA ML TDE to the existing SP-EM ML-based algorithm both in terms 
of estimation performance and computational complexity in Figs. 6 and 7, respectively. 

In Fig. 6, even though both estimators perform nearly the same with QPSK signals over the entire SNR range, we observe with 16-QAM a clear advantage of the new CA ML TD estimator over SP-EM at low SNR levels. The superiority of the proposed estimator over SP-EM can be even better appreciated when it comes to computational complexity. In fact, we plot in Fig. 7(a) the total number of operations
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(a) (b) 



Fig. 6. NMSE of the new CA ML estimator and SP-EM for different code rates and modulation orders, rolloff = 0.2. 


Fig. 7. Complexity of the new CA ML estimator and SP-EM for different code rates versus the modulation order: (a) total number of operations, and (b) complexity ratio.


(i.e., additions, multiplications, and divisions) required by both estimators versus the modulation order. There we can see that the new CA ML estimator entails a much lower computational load. The ratio of complexities depicted in Fig. 7(b) suggests, indeed, that the proposed estimator is about 30 and 70 times less computationally expensive than SP-EM for 64- and 256-QAM, respectively.
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VII. Conclusion 

In this paper, we derived for the first time the closed-form expressions of the Cramer-Rao lower bounds for code-aided symbol timing estimation from turbo-coded square-QAM transmissions. The new CA CRLBs revealed that huge performance improvements in terms of timing recovery are achievable by exploiting the soft information delivered by SISO decoders at each turbo iteration. The new analytical CRLBs coincide exactly with their empirical counterparts, which were established in previous pioneering works on the subject from exhaustive Monte-Carlo simulations. We also developed a new code-aided ML time delay estimator that is able to achieve the potential performance gains made thoroughly and instantly predictable by the new closed-form CA CRLBs. The new estimator also exhibits a remarkable advantage in terms of computational complexity as compared to the most powerful ML-type algorithm that exists in the literature, namely SP-EM. Simulation results also show, as intuitively expected, that the CA estimation performance improves by decreasing the coding rate, i.e., increasing the amount of redundancy.

Appendix A 

A.1) Proof of Lemma 1:

In order to find the pdfs of $u_k(\tau)$ and $v_k(\tau)$ defined in (37) and (38), respectively, and prove that they are two independent RVs, we define the following proper complex RV:

$$y_k(\tau) = \int_{-\infty}^{+\infty} y(t)\,h(t-kT-\tau)\,dt = u_k(\tau) + j\,v_k(\tau), \tag{100}$$

which verifies $p[y_k(\tau)] = p[u_k(\tau), v_k(\tau)]$. Moreover, replacing $y(t)$ by its expression given by (4) in (100) and resorting to some easy algebraic manipulations, we obtain:

$$y_k(\tau) = \sqrt{E_s}\sum_{k'=1}^{K} a(k')\underbrace{\int_{-\infty}^{+\infty} h(x)\,h(x + [k'-k]T)\,dx}_{g([k'-k]T)} + w_k(\tau),$$

where $w_k(\tau)$ is the filtered noise component, i.e.:

$$w_k(\tau) = \int_{-\infty}^{+\infty} w(t)\,h(t-kT-\tau)\,dt. \tag{101}$$

Recall that the shaping pulse $g(t)$ verifies the first Nyquist criterion stated in (6), i.e., $g([k'-k]T) = \delta(k'-k)$, thereby leading to:

$$y_k(\tau) = \sqrt{E_s}\,a(k) + w_k(\tau). \tag{102}$$


Further, it can be verified from (101) that $w_k(\tau)$ is Gaussian distributed with zero mean and variance $2\sigma^2$. Hence, the pdf of $y_k(\tau)$ conditioned on $a(k)$ is also Gaussian; i.e., $\forall c_m \in \mathcal{C}_p$ we have:

$$p[y_k(\tau)\,|\,a(k) = c_m] = \frac{1}{2\pi\sigma^2}\exp\!\left\{-\frac{1}{2\sigma^2}\left|y_k(\tau) - \sqrt{E_s}\,c_m\right|^2\right\}.$$

After expanding the modulus in the exponential argument, it can be easily shown that $\forall c_m \in \mathcal{C}_p$ we have:

p[yk{T)\a{k) = Cm] = 0^(cm,|/(t)), (103) 

where Clr(cm,y{t)) is given in (22). Then, by averaging over all the eonstellation points in Cp and 
reealling the expression of Clk{T) in (27), the pdf of ykir) is obtained as: 

|yfc(^)l^ - 

2." Clkir). (104) 


Finally, using the factorization obtained in (43) along with $|y_k(\tau)|^2 = u_k^2(\tau) + v_k^2(\tau)$ and $\beta_k = \beta_{k,2p}\,\beta_{k,2p-1}$, it follows that:

$$p[y_k(\tau)] = \frac{4\beta_{k,2p}\,\beta_{k,2p-1}}{2\pi\sigma^2}\,e^{-\frac{u_k^2(\tau)+v_k^2(\tau)}{2\sigma^2}}\,F_{k,2p}(u_k(\tau))\,F_{k,2p-1}(v_k(\tau)) = \underbrace{\frac{2\beta_{k,2p}}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{u_k^2(\tau)}{2\sigma^2}}F_{k,2p}(u_k(\tau))}_{p[u_k(\tau)]}\times\underbrace{\frac{2\beta_{k,2p-1}}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{v_k^2(\tau)}{2\sigma^2}}F_{k,2p-1}(v_k(\tau))}_{p[v_k(\tau)]}.$$

From the last equality, we obtain $p[y_k(\tau)] = p[u_k(\tau)]\,p[v_k(\tau)]$. But since from (100) we already have $y_k(\tau) = u_k(\tau) + j\,v_k(\tau)$, we also have $p[y_k(\tau)] = p[u_k(\tau), v_k(\tau)]$. Therefore, it follows that $p[u_k(\tau), v_k(\tau)] = p[u_k(\tau)]\,p[v_k(\tau)]$, meaning that the two RVs $u_k(\tau)$ and $v_k(\tau)$ are actually independent and their distributions are, respectively, given by (51) and (52).


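A.2) Statistical Independence of $u_k(\tau)$ and $\dot{u}_k(\tau)$:

First, it follows from (100) that:

$$\dot{u}_k(\tau) = \frac{\partial u_k(\tau)}{\partial\tau} = \Re\left\{-\int_{-\infty}^{+\infty} y(t)\,\dot{h}(t-kT-\tau)\,dt\right\}. \tag{105}$$

Again, we replace $y(t)$ by its expression given in (4) and then we use the fact that $\dot{g}(\cdot)$ is an odd function to show that:

$$\dot{u}_k(\tau) = \sqrt{E_s}\sum_{k'=1}^{K}\Re\{a(k')\}\,\dot{g}([k-k']T) + \Re\{\dot{w}_k(\tau)\}, \tag{106}$$

where $\dot{w}_k(\tau)$ is the derivative of $w_k(\tau)$ with respect to $\tau$, which is obtained by replacing $h(t-kT-\tau)$ by $-\dot{h}(t-kT-\tau)$ back in (101). Recall also that $\dot{g}(0) = 0$ (since the maximum of $g(x)$ is located at 0), leading to:

$$\dot{u}_k(\tau) = \sqrt{E_s}\sum_{\substack{k'=1\\ k'\neq k}}^{K}\Re\{a(k')\}\,\dot{g}([k-k']T) + \Re\{\dot{w}_k(\tau)\}. \tag{107}$$

Recall also from (100) that $u_k(\tau) = \Re\{y_k(\tau)\}$ and, therefore, we have from (102):

$$u_k(\tau) = \sqrt{E_s}\,\Re\{a(k)\} + \Re\{w_k(\tau)\}. \tag{108}$$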

Notice from (107) that $\dot{u}_k(\tau)$ involves the contribution of all the symbols except the one [i.e., $a(k)$] that is, in turn, the only one involved in $u_k(\tau)$ as seen from (108). Since the symbols are mutually independent, in order to show the independence of $\dot{u}_k(\tau)$ and $u_k(\tau)$ it suffices to show the independence of $w_k(\tau)$ and $\dot{w}_k(\tau)$. These are actually two RVs that are obtained from linear transformations (i.e., integral and derivative) of the same Gaussian process $w(t)$ and, hence, they are also Gaussian distributed. Their cross-correlation is given by:

$$\mathrm{E}\{w_k(\tau)\,\dot{w}_k(\tau)\} = -\iint_{-\infty}^{+\infty}\mathrm{E}\{w(t_1)w(t_2)\}\,h(t_1-kT-\tau)\,\dot{h}(t_2-kT-\tau)\,dt_1\,dt_2 = -2\sigma^2\iint_{-\infty}^{+\infty}\delta(t_1-t_2)\,h(t_1)\,\dot{h}(t_2)\,dt_1\,dt_2 = -2\sigma^2\,\dot{g}(0) = 0, \tag{109}$$

meaning that the two Gaussian-distributed RVs $w_k(\tau)$ and $\dot{w}_k(\tau)$ are uncorrelated and, therefore, independent as well. Consequently, $u_k(\tau)$ and $\dot{u}_k(\tau)$ are also independent.


Appendix B 


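Using the decomposition $\mathcal{C}_p = \widetilde{\mathcal{C}}_p \cup (-\widetilde{\mathcal{C}}_p) \cup \widetilde{\mathcal{C}}_p^{*} \cup (-\widetilde{\mathcal{C}}_p^{*})$, where $\widetilde{\mathcal{C}}_p$ denotes the subset of constellation points lying in the top-right quadrant, and noticing that:

$$\Re\{c_m\}^2 = \Re\{-c_m\}^2 = \Re\{c_m^{*}\}^2 = \Re\{-c_m^{*}\}^2, \quad \forall c_m \in \widetilde{\mathcal{C}}_p,$$

we rewrite (61) as follows:

$$\mathrm{E}\{\Re\{a(k)\}^2\} = \sum_{c_m\in\widetilde{\mathcal{C}}_p}\Big(P[a(k) = c_m] + P[a(k) = -c_m] + P[a(k) = c_m^{*}] + P[a(k) = -c_m^{*}]\Big)\,\Re\{c_m\}^2. \tag{110}$$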

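Moreover, by using the explicit expressions of the symbols' APPs given in (29)-(32), along with the identity $\cosh(x) + \cosh(y) = 2\cosh\!\left(\frac{x+y}{2}\right)\cosh\!\left(\frac{x-y}{2}\right)$, we obtain:

$$\Pr[a(k) = c_m] + \Pr[a(k) = -c_m] + \Pr[a(k) = c_m^{*}] + \Pr[a(k) = -c_m^{*}] = 2\beta_k\,\mu_{k,p}(c_m)\left[\cosh\!\left(\frac{L_{2p}(k)+L_{2p-1}(k)}{2}\right) + \cosh\!\left(\frac{L_{2p}(k)-L_{2p-1}(k)}{2}\right)\right] = 4\beta_k\,\mu_{k,p}(c_m)\cosh\!\left(\frac{L_{2p}(k)}{2}\right)\cosh\!\left(\frac{L_{2p-1}(k)}{2}\right). \tag{111}$$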

Now, plugging (111) back into (110), rewriting the sum over $c_m \in \widetilde{\mathcal{C}}_p$ as a double sum over the counters $i$ and $n$ [where $c_m = (2i-1)d_p + j(2n-1)d_p$ as done in (39)], and using the decomposition in (40), it can be shown that:

$$\mathrm{E}\{\Re\{a(k)\}^2\} = 4\beta_k\sum_{i=1}^{2^{p-1}}\sum_{n=1}^{2^{p-1}}(2i-1)^2 d_p^2\,\theta_{k,2p}(i)\,\theta_{k,2p-1}(n)\cosh\!\left(\frac{L_{2p}(k)}{2}\right)\cosh\!\left(\frac{L_{2p-1}(k)}{2}\right) = 2\beta_{k,2p}\cosh\!\left(\frac{L_{2p}(k)}{2}\right)\sum_{i=1}^{2^{p-1}}(2i-1)^2 d_p^2\,\theta_{k,2p}(i)\times 2\beta_{k,2p-1}\cosh\!\left(\frac{L_{2p-1}(k)}{2}\right)\sum_{n=1}^{2^{p-1}}\theta_{k,2p-1}(n), \tag{112}$$

where the decomposition $\beta_k = \beta_{k,2p}\,\beta_{k,2p-1}$ was used in the last equality as well. Moreover, it has been recently shown in [34, Lemma 3] that for $q = 2p$ and $2p - 1$:

$$2\beta_{k,q}\cosh\!\left(\frac{L_q(k)}{2}\right)\sum_{n=1}^{2^{p-1}}\theta_{k,q}(n) = 1, \tag{113}$$

which is used back in (112) to obtain the following result:

$$\mathrm{E}\{\Re\{a(k)\}^2\} = \omega_{k,2p}. \tag{114}$$
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